Bus systems and methods for controlling data flow in a field of processing elements

ABSTRACT

A bus system for a configurable architecture and methods therefor are provided in which optimization of the configuration efficiency and reconfiguration efficiency are taken into account separately. A system and method may include controlling data transmission by: transmitting, by a first hardware element and to a second hardware element, a data packet conditional upon and/or responsive to the second hardware element's assignment of a signal to a connecting bus via which the data packet is transmitted, where the signal indicates that no incoming data packet can be lost. A system and method may include controlling data transmission by: transmitting, by a first hardware element and to a second hardware element, a first data packet and subsequently a second data packet; and receiving, by the first hardware element and from the second hardware element, an acknowledgement of the first data packet subsequent to the transmittal of the second data packet.

CROSS REFERENCE TO RELATED APPLICATIONS

This application is a divisional of and claims priority to U.S. patent application Ser. No. 10/504,684, filed on Jul. 14, 2006, which is the National Stage of International Patent Application Serial No. PCT/DE2003/000489, filed on Feb. 18, 2003, the entire contents of each of which are expressly incorporated herein by reference.

FIELD OF THE INVENTION

The present invention relates to methods and embodiments of bus systems for configurable architectures. More specifically, embodiments of the present invention relate to configuration optimization and reconfiguration efficiency.

BACKGROUND INFORMATION

A reconfigurable architecture is understood to refer to modules (VPUs) having a configurable function and/or interconnection, in particular integrated modules having a plurality of arithmetic and/or logic and/or analog and/or storing and/or interconnecting modules (hereinafter referred to as PAEs) arranged in one or more dimensions and/or communicative peripheral modules (IO) interconnected directly or via one or more bus systems. PAEs may be of any embodiment or mixture and arranged in any hierarchy. This arrangement is referred to below as a PAE array or PA.

Generic modules of this type include systolic arrays, neural networks, multiprocessor systems, processors having multiple arithmetic units and/or logic cells, interconnecting and network modules such as crossbar switches, as well as known modules of the types FPGA, DPGA, XPUTER, etc. In this context, reference is made in particular to the following patents and applications by the present applicant: P 44 16 881.0-53, DE 197 81 412.3, DE 197 81 483.2, DE 196 54 846.2-53, DE 196 54 593.5-53, DE 197 04 044.6-53, DE 198 80 129.7, DE 198 61 088.2-53, DE 199 80 312.9, PCT/DE 00/08169, DE 100 36 627.9-33, DE 100 28 397.7, DE 101 10 530.4, DE 101 11 014.6, PCT/EP 00/10516, EP 01 102 674.7, PCT/DE97/02949, PCT/DE97/02998, PCT/DE97/02999, PCT/DE98/00334, PCT/DE99/00504, PCT/DE99/00505, PCT/EP02/10065, PCT/DE00/01869, PCT/DE02/03278, PCT/EP02/02403, PCT/DE03/00152, DE 102 06 857.7, DE 102 40 000.8, PCT/EP02/02402, DE 02 027 277.9, and EP 01 129 923.7, which are hereby incorporated by reference herein in their entireties.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 shows the two processing elements, a bus, and their IDs.

FIG. 2 shows a bus segment with double switches for controlling data flow between bus segments, according to an example embodiment of the present invention.

FIGS. 3 a-3 c show a bus system in various states of configuration, and the use of switches for connecting an input of a processing element, according to an example embodiment of the present invention.

FIGS. 4 a-4 c show a bus system in various states of configuration, and the use of switches and RdyHold stages for connecting an output of a processing element, according to an example embodiment of the present invention.

FIGS. 5 a-5 c show examples of processing elements with differently configured interconnections, and signal propagation in the case of branching or loops.

FIG. 6 a shows a conventional bus design.

FIG. 6 b shows a bus design according to an example embodiment of the present invention.

FIG. 7 shows different types of connections between buses using either one switch or two switches and using configuration bits to determine the states of the switches, according to example embodiments of the present invention.

FIGS. 8 a and 8 b illustrate how to respond to a SyncReconfig that arrives before a configuration is completely configured, according to example embodiments of the present invention.

FIG. 9 illustrates an architecture implementing a RDY/ACK protocol according to an example embodiment of the present invention.

FIG. 10 illustrates a modified architecture implementing a RDY/ACK protocol according to an example embodiment of the present invention.

FIG. 11 illustrates an architecture including a double receiver input register, implementing a transmitter/receiver protocol according to an example embodiment of the present invention.

FIG. 12 illustrates a modified architecture implementing a protocol between a transmitter and receiver, where all modules have registers at the output, according to an example embodiment of the present invention.

FIG. 13 illustrates an architecture implementing a RDY-ABLE protocol according to an example embodiment of the present invention.

FIG. 14 shows a bus signal between a transmitter and a receiver using credit system timing.

FIG. 15 shows a bus signal between a transmitter and a receiver using a RDY protocol.

FIG. 16 shows a bus signal where a pulsed RDY-ABLE protocol is used.

FIG. 17 shows hardware for receiving and sending data using a RDY/ABLE protocol according to an example embodiment of the present invention.

FIG. 18 illustrates an example interface arrangement of AMBA for a control manager (CM) interface of a unit having an XPP core according to an example embodiment of the present invention.

FIG. 19 shows an internal structure of a receiver part in an external interface for a 16-bit output port of the CM according to an example embodiment of the present invention.

FIG. 20 shows an internal structure of a transmitter part of an external module that establishes an interface connection with the 16-bit input port of the configuration manager according to an example embodiment of the present invention.

DETAILED DESCRIPTION

The architecture indicated above is used as an example for illustration and is referred to below as a VPU. This architecture has any number of arithmetic or logic cells (including memories) and/or memory cells and/or interconnection cells and/or communicative/peripheral (IO) cells (PAEs), which may be arranged to form a one-dimensional or multidimensional matrix (PA), which may have different cells of any configuration. Bus systems are also understood to be cells. The matrix as a whole or parts thereof are assigned a configuration unit (CT), which influences the interconnection and function of the PA. Improvements are still possible with such architectures, e.g., with regard to the procedure and/or speed of reconfiguration.

1. Structure of Bus Systems

Conventional implementation of configuration requires synchronization between the objects. Objects are understood to refer to all data processing modules (PAEs) and, inasmuch as necessary, also the data transferring modules such as bus systems. This synchronization is implemented centrally, e.g., via a FILMO (see PCT/DE97/02998, PCT/DE97/02999, PCT/DE99/00504, PCT/DE99/00505, and PCT/EP01/06703). Therefore, at least as many cycles elapse between the end of an old configuration (reconfig trigger; see PCT/DE98/00334) and the beginning of a new configuration (object again enters the “configured” state) as would correspond to the length of the pipelined CM bus (forward and return; see PCT/EP01/06703).

Two methods for accelerating this procedure, according to example embodiments of the present invention, provide, respectively, that:

a) the required sequence is ensured by additional logic in the objects, e.g., management of IDs; and

b) the objects are modified so that it is no longer necessary to take the sequence into account and instead the proper interconnection is ensured by the architecture of the objects.

For the following considerations, the modules present in a typical reconfigurable architecture are divided into two groups: buses and objects. The buses group includes the connecting line between two segments. It is represented by the segment switch at one end. The objects group includes all the objects which have a connection to a bus and/or communicate with their environment, i.e., any PAE (e.g., memory, ALUs), IO, etc.

Typically there are dependencies mainly among all directly adjacent objects, including specifically bus to bus, object to bus, and object to object. With respect to bus to bus, a bus is represented by the segment switch at the end of a bus. With respect to object to bus, the object is to be selected freely from FREG, BREG, ALU and RAM. Everything that has a connection is likewise counted as an object in this sense. With respect to object to object, these are not usually directly adjacent and there is normally a bus in between. There is then no dependence. In the case of a direct connection, the connection behaves according to “bus to bus” and/or “object to bus” depending on the embodiment.

1.1 Bus-to-Bus Dependence

In the related art, longer buses are configured from back to front, for example. An example of the bus design described below is illustrated in FIG. 6 a. The last bus segment (0606 a) is configured with an open bus switch (0607) while all others are configured with a closed bus switch. The sequence must be preserved to prevent data from running from a bus that is closer to the front to a bus that is closer to the rear which still belongs to another configuration.

1.2 Bus-to-Object Dependence

According to the related art, an object (e.g., 0601, 0602, 0603) may not be configured until it is ascertained that the buses (0606 a, 0606 b) used by the object have already been configured. This dependence also exists to ensure that no data is running into a foreign configuration (PAE output) and/or is taken from a foreign configuration (PAE input).

In summary, it may be concluded that there is always a dependence when an object establishes, has established, and/or wishes to establish a connection to another object. This takes place by way of the connection mask (0608) which controls the connection of the object inputs and/or outputs onto the buses (e.g., via multiplexers, transmission gates and the like; see also PCT/EP02/02403, FIGS. 5 and 7 c) and/or closed bus switches (0607) which permit the transfer of information via a bus (e.g., from one segment (0606 a[1]) to another segment (0606 a[2])). In other words, this connection mask indicates which horizontal bus structure is connected to which vertical bus structure and where this occurs; the fact that a “lane change” to a horizontal bus structure, for example, is also possible should be mentioned for the sake of thoroughness. The connection must not be established until it is ascertained that the object to which the connection is to be established already belongs to the same configuration, i.e., has already been configured accordingly.

2. Control Over ID Management

A first approach, according to an example embodiment of the present invention, is to store the ID or array ID currently being used by the object in each object (see PCT/DE99/00504 and PCT/DE99/00505). Therefore, information regarding which task and/or configuration the particular object is being assigned to at the moment is stored. As soon as a connection between two objects is configured (e.g., between a PAE output and a bus), a check is performed in advance to determine whether both objects have the same ID/array ID. If this is not the case, the connection must not be established. Thus, a connection is activated and/or allowed, depending on a comparison of identifying information.
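The ID comparison described above can be illustrated with a minimal sketch. The class and function names (`ConfigObject`, `may_connect`) and the use of `None` for an unconfigured object are assumptions made for illustration only; they do not appear in the referenced applications.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ConfigObject:
    """A PAE port or a bus that stores the ID of its current configuration.

    Hypothetical model; None is an assumed marker for "not configured".
    """
    name: str
    config_id: Optional[int] = None


def may_connect(a: ConfigObject, b: ConfigObject) -> bool:
    """Model of one comparator: allow the connection only for equal IDs.

    Treating an unconfigured object as non-connectable is an assumption.
    """
    return a.config_id is not None and a.config_id == b.config_id


pae_output = ConfigObject("PAE output", config_id=7)
bus = ConfigObject("bus", config_id=3)   # bus still belongs to another configuration
print(may_connect(pae_output, bus))      # False: connection must not be established
bus.config_id = 7                        # bus has been reconfigured
print(may_connect(pae_output, bus))      # True: both objects carry the same ID
```

One such comparison is needed for every possible connection, which is the source of the hardware cost discussed next.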

Although this method is conceptually simple, it requires considerable hardware complexity, because, for each possible connection, registers are required for storing the IDs/array IDs and comparators are required for comparing the IDs/array IDs of the two objects to be connected.

FIG. 1 shows the two PAEs (0101, 0102) together with their IDs and a bus (0103) with its ID. Each PAE/bus connection is checked via the comparators (0104, 0105). The figure is used only to illustrate the basic principle without being restrictive. If all resources (inputs/outputs of the PAEs, buses) are taken into account, there is a considerable increase in complexity and the associated hardware expenditure. A method according to the present invention which is implemented much more favorably from a technical standpoint and is therefore preferred is discussed in the following sections.

3. Control Over the Interconnection Structure

FIG. 2 shows a bus segment needed by configurations A and B. However, it is still occupied by configuration A, as shown here. Configuration B may already occupy the two neighboring bus segments independently thereof. According to the present invention, through the new double bus switches (0201 and 0202, corresponding to 0607 and, according to FIG. 6 b, 0609), the possibility may be ruled out that data from configuration B will interfere with the data flow of configuration A. Likewise, no data runs from configuration A to B. In the case of configuration B, it is assumed that configuration A has been correctly implemented and that the bus switch at the output is open.

As soon as configuration A is concluded, the bus thus released is occupied by configuration B and configuration B begins to work.

In other words, one basic principle of the method is that each element involved in a data transmission connects itself automatically to the corresponding data source and/or the data transmitter, i.e., it has the control itself of which data transmitter/receiver it is to be connected to according to the configuration.

Bus to PAE Input

FIG. 3 shows a PAE input (0301) which is to be connected to the two lower buses of the three buses shown here. The vertical switches correspond to a simple connection switch of the connection mask (0608) for connection to the bus and are managed by the PAE (0302), and in addition, the horizontal switches (0303, corresponding to 0610) are also configured via the bus to ensure a correct connection.

The middle bus in FIG. 3 a is still occupied by another configuration. Nevertheless, the object may be configured completely using the PAE input. Data from the middle bus cannot run unintentionally into the object because this is prevented by the configuration of the bus (switch 0303).

In FIG. 3 b, the old configuration has been terminated and replaced by the new configuration. Now both buses are available. To determine which buses are in fact connected, only the vertical switches (0302) are used.

Finally, the upper bus in FIG. 3 c is occupied by a third configuration, which would also like to use the PAE input shown. Therefore, the bus is configured so that data may be withdrawn at this point. However, this has no effect on the object because the PAE configuration does not provide any connection at this point. The connection is thus not established until the configuration of the PAE input changes.

Bus-PAE Output

This is a connection in which the use of two separate switches is particularly preferred. It may be preferable in the (two) other cases to implement the functionality with one switch which is controlled by two config bits which are interlinked by Boolean logic, preferably by an AND link, to determine the switch state. FIG. 4 shows a PAE output which is to be connected to the two lower buses of the three buses illustrated here. The object is configured independently of the availability of the buses, the switches on the left in the figure corresponding to the connection mask.

The middle bus (0401) in FIG. 4 a is still occupied by another configuration. Now a data packet may be sent from the output register to the connection. It is stored in the connected RdyHold (see PCT/EP02/02403) stages. The packet may not be transmitted through the opened switch of the middle bus and thus also may not be acknowledged, i.e., the transmitter does not receive an acknowledgment of receipt. Thus, the object may not transmit any further data packets with the usual protocols.

Now in FIG. 4 b the middle bus has been reconfigured, i.e., the switch closed, so that data may again be transmitted here. A packet that has possibly already been stored is now on the bus; otherwise everything functions like before.

In FIG. 4 c the top bus (0402) is requested by a third configuration. The switch on the bus side behind the RM remains open accordingly, because data transfer is to be prevented on the bus side. Here again, everything otherwise behaves like before.
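The behavior of a PAE output in front of an open bus switch (FIGS. 4 a and 4 b) can be sketched as follows. This is a simplified software model under stated assumptions: the class `RdyHoldStage`, its one-packet capacity, and the per-cycle `clock` method are hypothetical and are not taken from PCT/EP02/02403.

```python
from typing import List, Optional


class RdyHoldStage:
    """Hypothetical model of a holding stage in front of a bus-side switch."""

    def __init__(self) -> None:
        self.held: Optional[int] = None    # packet waiting for the bus (assumed depth: 1)
        self.bus_switch_closed = False     # set by the bus, not by the PAE
        self.bus: List[int] = []           # packets that actually reached the bus

    def send(self, packet: int) -> bool:
        """Transmitter offers a packet; refused while the previous one is unacknowledged."""
        if self.held is not None:
            return False
        self.held = packet
        return True

    def clock(self) -> bool:
        """One cycle: forward the held packet if the bus switch is closed.

        Returns True when the packet was forwarded, which stands in for the
        acknowledgment returned to the transmitter.
        """
        if self.held is not None and self.bus_switch_closed:
            self.bus.append(self.held)
            self.held = None
            return True
        return False


stage = RdyHoldStage()
print(stage.send(1))            # True: first packet is stored in the stage
print(stage.clock())            # False: switch open, no acknowledgment
print(stage.send(2))            # False: transmitter is stalled
stage.bus_switch_closed = True  # bus has been reconfigured (FIG. 4 b)
print(stage.clock())            # True: stored packet is now on the bus
print(stage.send(2))            # True: transmission continues as before
```

The point of the sketch is that the PAE output needs no knowledge of the bus state; the missing acknowledgment alone stalls it.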

Result

The reconfiguration performance may be increased substantially with relatively simple hardware. In particular, it is thus even more possible to preload multiple complete configurations into the objects because the objects may then be configured individually per object and independently according to the prevailing data processing status of each without any problems being expected.

After arrival of the reconfiguration signal requesting reconfiguration, each object needs locally, until it is configured again, only as many cycles as there are configuration words, assuming that the configuration words are transmitted in cycles. The reconfiguration time may be reduced further, to approximately zero cycles, by using a second register set in which configurations are predeposited.

In an optimized implementation that is preferred according to the present invention, the additional hardware complexity for buses and PAE inputs may be limited to one additional configuration bit and one AND gate per bus switch and per number of buses × number of PAE inputs. This is depicted in FIG. 7.

FIG. 7 a shows a left-hand bus (0606 a[1]) connected to a right-hand bus (0606 a[2]) via the bus switch. A configuration switch is assigned to each bus switch, indicating whether the switch is configured as being open or closed (c[1] for the left-hand bus and c[2] for the right-hand bus). In FIG. 7 b the same function is implemented by a single switch instead of two switches. The two configuration bits c[1] and c[2] are logically linked together by an AND gate (&) so that the single switch is closed only when both configuration bits in this example are logic b′1. Alternatively, an implementation via an OR gate is appropriate when a logic b′0 is to display a closed switch.

The PAE outputs may optionally require slightly more complexity, depending on the implementation, if an additional switch is considered to be necessary for each. In this connection, it should be pointed out that although it is possible to provide the connection to and/or between all objects according to the present invention, this is by no means obligatory. Instead, it is possible to implement embodiments of the present invention only in some objects.

FIG. 6 b shows as an example a design of an object and a bus according to the present invention. The basic design corresponds to the related art according to FIG. 6 a and/or according to PCT/EP02/02403, FIGS. 5 and 7 c. Therefore, only the elements in FIG. 6 b that are novel in comparison with the related art will be described here. The switches on bus ends 0609 are inserted according to an example embodiment of the present invention, so the buses are completely separable by switches 0607 and 0609. Switches (0610) at the inputs and outputs of the objects (PAEs), regulating the correct connections to the buses, are also novel.

A basic principle now is that each object and/or each bus independently regulates, i.e., determines, which connections are to be established and/or remain in effect at the moment. It should be pointed out here that this determination is performed by the individual object and/or bus depending on the configuration, i.e., it is by no means arbitrary. Management of the connections is thus more or less delegated to the objects involved. Each bus may regulate which other buses it will be connected to via switches 0607 and 0609 according to the configuration. No bus may now be connected to another (e.g., via 0607) without the other bus allowing this through a corresponding switch setting of its bus switches (e.g., 0609).

It should be pointed out explicitly that switch 0607 according to the related art could also be situated at the output of a bus and switch 0609 is added at the input of the bus accordingly.

Switches 0610 are preferably also double switches, one switch being controlled by the PAE object and the other switch being controlled by the particular bus system 0606 a and/or 0606 b. It should be pointed out in particular that one switch is merely indicated with dashed lines. This is the switch controlled by bus 0606 a and/or 0606 b and it may be implemented “virtually” by the setting of the connection mask (0608).
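The single-switch variant of FIG. 7 b reduces to a two-input Boolean function. The following sketch shows both encodings mentioned above; the function names are hypothetical, and the bits are modeled as Python values rather than hardware signals.

```python
def switch_closed(c1: bool, c2: bool) -> bool:
    """Active-high encoding: the switch is closed only if both sides agree (AND)."""
    return c1 and c2


def switch_control_active_low(c1: int, c2: int) -> int:
    """Active-low encoding, where 0 means "closed": an OR gate yields 0 only if both bits are 0."""
    return c1 | c2


# Truth table for both encodings.
for c1 in (0, 1):
    for c2 in (0, 1):
        print(c1, c2, switch_closed(bool(c1), bool(c2)), switch_control_active_low(c1, c2))
```

In both encodings either side alone can keep the switch open, so neither bus can force a connection onto the other.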

5. Reconfiguration Control

Control of the reconfiguration is triggered in the VPU technology by signals (Reconfig) which are usually propagated with the data packets and/or trigger packets over the bus systems and indicate that a certain resource may or should be reconfigured and, if necessary, the new configuration is selected at the same time (see PCT/DE98/00334 and PCT/DE00/01869).

If a reconfigurable module is to be only partially reconfigured, then Reconfig must be interrupted at certain locations according to the algorithm. This interruption, which prevents forwarding of Reconfig, is referred to as ReconfigBlock.

ReconfigBlocks are usually introduced at the boundary of one configuration with the next to separate them from one another.

Different strategies for sending Reconfig signals are selected as requested by the algorithm.

Now three example embodiments of the present invention will be described. These embodiments may be used individually and/or combined and they have different behaviors. It is regarded as inventive in comparison with the related art that it is possible to select between such embodiments in pairs.

a) ForcedReconfig: The simplest strategy is to send the Reconfig signal via all interfaces of an object, i.e., it propagates along the data paths and/or trigger paths belonging to a certain configuration while other configurations remain unaffected. This ensures that all interconnected objects in the PA receive the signal. For the sake of restriction, the signal must be blocked at suitable locations. This method, i.e., signal, ensures that a configuration is removed completely. The signal is referred to below as ForcedReconfig. This signal should be used only after all data in the particular objects have been processed and removed because there is no synchronization with data processing. Although all objects belonging to a certain configuration within an array are thus forced to allow reconfiguration, other configurations running simultaneously on other objects of the same array remain unaffected.

b) SyncReconfig: A Reconfig is sent together with the corresponding data and/or triggers. It is sent only together with active data packets and/or trigger packets. The signal is preferably relayed together with the last data packet and/or trigger packet to be processed and indicates the end of the data processing after this data/trigger packet. In an example embodiment, if a PAE requires multiple cycles for processing, the forwarding of SyncReconfig is delayed until the trigger packet and/or data packet has in fact been sent. This signal is thus synchronized with the last data processing. As described below, this synchronized reconfiguration according to the present invention may be blocked at certain locations.

c) ArrayReset: ArrayReset may be used as an extension of ForcedReconfig which cannot be blocked and results in reconfiguration of the complete array. This method is particularly appropriate when, for example, an application is terminated or an illegal opcode (see PCT/DE03/00152) and/or timeout of a configuration has occurred and proper termination of the configuration cannot be ensured with other strategies. This is important for a power-on reset, or the like, in particular.

5.1 SyncReconfig

When SyncReconfig is propagated, it always contains valid active data or triggers.

Problems occur when, in the case of branching, the signal is propagated only in the active branch (FIG. 5 a) or when branching or combining is blocked due to lack of data and/or triggers (FIG. 5 b).

To solve this problem, the semantics of SyncReconfig is defined as follows. The signal indicates that after receiving and completely processing the data/triggers, all the data/trigger sources (sources) and buses leading to the input of an object which has received the SyncReconfig signal are reconfigured. A ReconfigEcho signal may be introduced for this purpose. After the arrival of SyncReconfig at a destination object, a ReconfigEcho is generated by it, preferably only and as soon as the destination object has completely processed the data arriving with the SyncReconfig signal. This generated ReconfigEcho is then sent to all sources connected to the object, i.e., its inputs, and results in reconfiguration, i.e., reconfigurability of the sources and/or the bus systems transmitting data and/or triggers.

If an object receives a ReconfigEcho, this signal is transmitted further upstream, i.e., it is transmitted via the buses to its sources via all the inputs having bus switches still closed. After being generated, ReconfigEcho is thus sent to the data and/or trigger sources that feed into an object, and the signals are forwarded from there.

Inputs/outputs that have already received a SyncReconfig preferably become passive due to its arrival, i.e., they no longer execute any data/trigger transfers. Depending on the embodiment, a SyncReconfig may only induce passivation of the input at which the signal has arrived or passivation of all inputs of the PAE.

A ReconfigEcho usually arrives at the outputs of PAEs. This causes the ReconfigEcho to be relayed via the inputs of the PAE if they have not already been passivated by a received SyncReconfig.

In some cases, e.g., in FIGS. 5 a through 5 c, ReconfigEcho may also occur at the inputs. This may result in passivation of the input at which the signal arrived, depending on the embodiment, or in a preferred embodiment it may trigger passivation of all inputs of the PAEs.
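The upstream propagation of ReconfigEcho can be pictured as a traversal of the data-flow graph against the direction of the data. The sketch below is a deliberately simplified model: it follows a single echo from one generating object, and the names `propagate_echo`, `sources`, and `blocked` are hypothetical. The example graph, with object names `a` through `f`, is only loosely modeled on a branch that is later merged.

```python
from typing import AbstractSet, Dict, List, Tuple


def propagate_echo(
    sources: Dict[str, List[str]],
    start: str,
    blocked: AbstractSet[Tuple[str, str]] = frozenset(),
) -> List[str]:
    """Return the objects released for reconfiguration, in visiting order.

    sources[x] lists the objects feeding the inputs of x.
    blocked holds (object, source) pairs whose input does not relay the echo,
    e.g., because of a ReconfigBlock or an already opened bus switch.
    """
    released: List[str] = []
    pending = [start]
    while pending:
        obj = pending.pop(0)
        for src in sources.get(obj, []):
            if (obj, src) in blocked or src in released:
                continue
            released.append(src)   # source (and the bus to it) may be reconfigured
            pending.append(src)    # source relays the echo further upstream
    return released


# Hypothetical graph: a feeds b and e; b -> c -> d and e -> f -> d.
graph = {"d": ["c", "f"], "c": ["b"], "b": ["a"], "f": ["e"], "e": ["a"]}
print(propagate_echo(graph, "d"))                        # ['c', 'f', 'b', 'e', 'a']
print(propagate_echo(graph, "d", blocked={("f", "e")}))  # ['c', 'f', 'b', 'a']
```

In the second call the echo is not relayed from `f` to `e`, so `e` is never released in this model.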

5.2 Trigger having Reconfig Semantics

In some cases (e.g., FIG. 5 b), an implicit propagation of the Reconfig signals (in particular SyncReconfig, ReconfigEcho) is impossible.

For the required explicit transmission of any Reconfig signals, the trigger system according to PCT/DE98/00334 may be used, to which end the trigger semantics is extended accordingly. Triggers may thus transmit any status signals and control signals (e.g., carry, zero, overflow, STEP, STOP, GO; see PCT/DE98/00334, PCT/DE00/01869, and PCT/EP02/02403), as well as the implicit Reconfig signals. In addition, a trigger may assume the SyncReconfig, ReconfigEcho, or ForcedReconfig semantics.

5.3 Blocking

At each interface which sends a SyncReconfig, it is possible to set whether sending or relaying is to take place. Suppressing propagation results in stopping a reconfiguration wave that would otherwise propagate over the array and/or the configuration affected by it. However, regardless of the blockades to be set up for certain locations during configuration in a self-modifying or data-dependent manner and/or under or for certain conditions, data and/or trigger signals may continue to run over a blocked position, in order to be processed further as before, as provided with the protected configuration and/or a protected configuration part.

If necessary, it would also be possible to locally suppress the response to the reconfiguration request, i.e., to ignore the reconfiguration request locally but nevertheless send a signal indicative of the arrival of a locally ignored reconfiguration request signal to downstream objects, whether blocked or unblocked.

As a rule, however, when individual objects of a configuration are to be blocked, it is preferable to send the reconfiguration request signal over separate buses, bus segments or lines to downstream objects past a blocking object. The normally preferred case in which the reconfiguration request signal must penetrate into the object is then easier to maintain, i.e., not only peripherally relayed in forward or reverse registers, if provided, and thus sent past the actual cell. It is then preferable that, in the case of blocking of a reconfiguration request signal (or a certain reconfiguration request signal of a plurality of differentiable reconfiguration request signals), this blocked reconfiguration request signal “dies” in the particular object, i.e., is not to be forwarded.

If the acceptance of SyncReconfig at the receiving interface is blocked, then the receiving object switches the interface receiving SyncReconfig to passive (i.e., the interface no longer sends and/or receives any data); otherwise, the object does not respond to the signal but it may send back the ReconfigEcho to permit the release of the transmitting bus system.

In addition, it is possible to block ReconfigEcho either independently of and/or jointly with a ReconfigBlock.

5.4 Effect of SyncReconfig and ForcedReconfig on Bus Systems

To ensure that, after transmission of a SyncReconfig over a bus, no subsequent data and/or triggers, which originate from a following configuration, for example, and would thus be processed incorrectly, are transmitted, SyncReconfig preferably blocks the sending of the handshake signals RDY/ACK (see PCT/DE97/02949), which indicate the presence of valid data on the bus and control the data transmission, over the bus. The bus connections per se, i.e., the data and/or trigger network, are not interrupted to permit resending of ReconfigEcho over the bus system. The bus is dismantled and reconfigured only with the transmission of ReconfigEcho.

In other more general terms, according to an example embodiment of the present invention, the occurrence of SyncReconfig first prevents data and/or triggers, except for ReconfigEcho, from being relayed over a bus, e.g., by blocking the handshake protocols, and ReconfigEcho subsequently induces the release and reconfiguration of the bus.

Other methods having an equivalent effect may be used. For example, data and trigger connections may be interrupted even in a run-through of SyncReconfig, whereas the ReconfigEcho connection is dismantled only on occurrence of ReconfigEcho.

This ensures that data and triggers of different configurations which do not belong together will not be exchanged incorrectly via the configurations.
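The two-step release of a bus described above can be read as a small state machine. The sketch below assumes three states and the names `BusState` and `BusSegment`, none of which appear in the source; the RDY/ACK handshake is reduced to the Boolean return value of `transfer`.

```python
from enum import Enum, auto


class BusState(Enum):
    ACTIVE = auto()              # data and triggers pass with RDY/ACK
    HANDSHAKE_BLOCKED = auto()   # after SyncReconfig: only ReconfigEcho may pass
    RELEASED = auto()            # after ReconfigEcho: bus may be reconfigured


class BusSegment:
    """Hypothetical model of one bus between a transmitter and a receiver."""

    def __init__(self) -> None:
        self.state = BusState.ACTIVE

    def transfer(self, sync_reconfig: bool = False) -> bool:
        """Try to move one data/trigger packet; True if the handshake went through."""
        if self.state is not BusState.ACTIVE:
            return False
        if sync_reconfig:
            # The last packet still passes; later packets are blocked.
            self.state = BusState.HANDSHAKE_BLOCKED
        return True

    def reconfig_echo(self) -> bool:
        """Echo travels back over the still-connected bus and releases it."""
        if self.state is BusState.HANDSHAKE_BLOCKED:
            self.state = BusState.RELEASED
            return True
        return False


bus = BusSegment()
print(bus.transfer())                    # True: normal operation
print(bus.transfer(sync_reconfig=True))  # True: last packet carries SyncReconfig
print(bus.transfer())                    # False: handshake is blocked
print(bus.reconfig_echo())               # True: bus is released
print(bus.state)                         # BusState.RELEASED
```

Keeping the physical connection alive in the blocked state is what allows the echo to return over the same bus.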

FIG. 5 shows an example of PAEs (0501) having differently configured interconnections. The following transmissions are defined: data and/or trigger buses (0502), SyncReconfig (0503), and ReconfigEcho (0504). In addition, ReconfigBlock (0505) is also shown. 0506 indicates that SyncReconfig is not relayed.

FIG. 5 a illustrates a branching such as that which may occur, for example, due to an IF-THEN-ELSE construct in a program. After a PAE, the data is branched into two paths (0510, 0511), only one of which is active at a time. In the case depicted here, a last data packet is transmitted together with SyncReconfig, and branch 0510 is not active and therefore does not relay the data and does not relay SyncReconfig. Branch 0511 is active and relays the data and SyncReconfig. According to an example embodiment of the present invention, the transmitting bus system is switched to inactive immediately after the transmission and is then able to transmit back only ReconfigEcho. PAE 0501 b receives SyncReconfig and sends it to PAE 0501 c, which sends ReconfigEcho back to 0501 a, whereupon 0501 a and the bus system between 0501 a and 0501 b are reconfigured. The transmission between 0501 b and 0501 c takes place accordingly.

0501 e has also received SyncReconfig from 0501 a but the branch is notactive. Therefore, 0501 e does not respond, i.e., 0501 e does not sendSyncReconfig to 0501 f; nor does it send the ReconfigEcho back to 0501a.

0501 c processes the incoming data and forwards SyncReconfig to 0501 d.This sequence initially corresponds to the transmission from 0501 a to0501 b. After processing the data, 0501 d generates a ReconfigEcho whichis also sent to 0501 f because the branches are combined. Although 0501f has not performed a data operation, the unit is reconfigured and sendsthe ReconfigEcho to 0501 e which is then also reconfigured—without newdata processing having taken place.

ReconfigEcho transmitted from 0501 b to 0501 a may also be transmittedin a preferred embodiment to 0501 e where it arrives at an input. Thisresults in passivation of the input and in passivation of all inputs inan expanded embodiment, which may also be reconfigurable.

To impart a local character to the examples in FIG. 5, theinputs/outputs in the diagrams have been provided with a ReconfigBlockso that the forwarding of SyncReconfig and ReconfigEcho is suppressed.

FIG. 5 b is largely identical to FIG. 5 a, which is why the same references are used. The right-hand path is again active and the left-hand path is inactive. The essential difference is that instead of combining the paths at 0501 d, the paths now remain open and lead directly to the peripheral interface, for example. In such cases, it is possible and preferable to provide an explicit wiring of ReconfigEcho via trigger lines (0507) between the PAEs (0501 i and 0501 j).

FIG. 5 c shows the exemplary embodiment of a loop. This loop runs over PAEs 0501 m, . . . , 0501 r. The transmissions between PAEs 0501 m, . . . , 0501 r are evidently equivalent to those in the preceding discussion, in particular the transmissions between 0501 b and 0501 c.

The transmission between 0501 r and 0501 m deserves special attention. In an example embodiment, when ReconfigEcho appears at 0501 m, the bus (0508) between 0501 m and 0501 r is reconfigured by the transmission of ReconfigEcho. ReconfigEcho is blocked at the output of 0501 r. Therefore, 0501 r is not reconfigured but the particular output is switched to passive on arrival of ReconfigEcho, i.e., 0501 r no longer sends any results on the bus. Therefore, the bus may be used by any other configuration.

As soon as 0501 r receives ReconfigEcho from 0501 q, 0501 r is reconfigured at the end of the data processing. The ReconfigBlock and/or the passivation of the bus connection to 0501 m (0508) prevents forwarding toward 0501 m. Meanwhile, 0501 m and 0508 may be used by another configuration.

6.0 SyncReconfig II

Another optional method for controlling the SyncReconfig protocol is described below. This method may be preferred, depending on the application, the area of use, and/or embodiment of the semiconductor or system.

This method is defined as follows:

1. SyncReconfig is transmitted in principle over all connected buses of a PAE (data buses and/or trigger buses), even over the buses which are not currently (in the current cycle) transmitting any data and/or triggers.

2. In order for a PAE to relay SyncReconfig according to paragraph 1, first all the connected inputs of the PAE must have received SyncReconfig.

2a. Feedback in the data structure (e.g., loops) requires an exception to the postulate according to paragraph 2. Feedback coupling is excepted, i.e., it is sufficient if all the connected inputs of a PAE except those in a feedback loop have received SyncReconfig so that it is forwarded.

3. If a PAE is processing data (under some circumstances even in multiple cycles, e.g., division), then a SyncReconfig (if this is applied to the inputs according to 2 and 2a) is relayed to the receiver(s) at the point in time when the calculation and forwarding of the data and/or triggers is completed. In other words, SyncReconfig does not overtake data processing.

4. If a PAE is not processing any data (e.g., because no data is queued up at the inputs and/or there is no corresponding trigger for enabling data processing (see PCT/DE98/00334)) but it has received SyncReconfig at all configured inputs, then the PAE forwards SyncReconfig via all configured outputs. No data processing takes place (there is no queued-up input data and/or enable trigger (PCT/DE98/00334)), and accordingly no data is transmitted further. In other words: PAEs that are not processing data relay SyncReconfig further immediately to the connected receivers but with the cycles synchronized, if necessary.
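The relay condition in paragraphs 2 through 4 may be summarized as a single predicate. The following Python sketch is a minimal illustration under stated assumptions: the names Input, sync_received, feedback, and may_relay_sync_reconfig are hypothetical, and the clock synchronization mentioned in paragraph 4 is not modeled.

```python
from dataclasses import dataclass


@dataclass
class Input:
    """Hypothetical view of one connected input of a PAE."""
    sync_received: bool      # SyncReconfig has arrived at this input
    feedback: bool = False   # input belongs to a feedback loop (paragraph 2a)


def may_relay_sync_reconfig(inputs, processing):
    """Return True if the PAE may forward SyncReconfig in the current cycle.

    Paragraphs 2/2a: every connected input outside a feedback loop must
    have received SyncReconfig.
    Paragraph 3: a PAE that is still processing data waits until the
    calculation and forwarding are completed.
    Paragraph 4: an idle PAE forwards SyncReconfig immediately.
    """
    all_seen = all(i.sync_received for i in inputs if not i.feedback)
    return all_seen and not processing
```

For example, a PAE with one satisfied input and one outstanding feedback input may relay SyncReconfig once it is idle, whereas a PAE with an outstanding non-feedback input may not.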

SyncReconfig is preferably transmitted together with handshake signals (e.g., RDY/ACK=reaDY/ACKnowledge). A PAE sending a SyncReconfig does not enter the reconfigurable state until all receivers have confirmed receipt of SyncReconfig with an ACK(nowledge).

In this method, the basic question arises as to what happens when a configuration is not yet completely configured but is already to be reconfigured again. Apart from the consideration as to whether such behavior of an application does not require better programming, the problem is solved as follows: if a PAE attempts to forward SyncReconfig to a PAE that is not yet configured, it will not receive an ACK until the PAE is configured and acknowledges SyncReconfig. This might result in a loss of performance because the configuration that is to be deleted must first be completely configured before it can be deleted. On the other hand, however, this is a very rare case which occurs only under unusual circumstances.

FIG. 8 a shows a basic method to be used according to an example embodiment of the present invention. SyncReconfig 0805 arrives at PAE 0806, which forwards the signal at the end of data processing together with data (0807). Connections that have been configured but not used during the data processing also forward the signal (0808).

Although SyncReconfig arrives from 0806 via 0807 in the case of PAE 0809, SyncReconfig is still outstanding for the second input. Therefore, 0809 does not forward SyncReconfig. PAE 0810 receives SyncReconfig via 0808 but does not receive any data. Via the second input, PAE 0810 likewise receives a SyncReconfig. Although no data processing is taking place in PAE 0810 (the data via 0808 is still outstanding), PAE 0810 relays SyncReconfig without any result data.

FIG. 8 b shows the processing of a loop. During the data processing, data is fed back (0824) from PAE 0822 to PAE 0821. At 0821, a SyncReconfig arrives via 0820. This is relayed to the downstream PAEs in the loop as far as PAE 0822. PAE 0822 relays (0823) SyncReconfig to downstream PAEs not belonging to the loop. Neither SyncReconfig nor data is transmitted via loop feedback 0824 (see explanation 0803).

0801 means that no SyncReconfig has been transmitted on this bus at the point in time depicted as an example. 0801 implies no information regarding whether data/triggers have been transmitted.

0802 means that a SyncReconfig has been transmitted on this bus at the point in time depicted as an example. 0802 does not imply any statement regarding whether data/triggers have been transmitted.

0803 means that in the case of occurrence of a SyncReconfig at the data transmitter (in this example 0822), no SyncReconfig is transmitted on this bus (regardless of the point in time). 0803 implies that no data/triggers are transmitted.

7. Alternative Protocoling

A protocol according to an example embodiment of the present invention is described below as an alternative to the known RDY/ACK data flow control protocol. It secures data streams even when registers are inserted between the transmitter and receiver at high clock frequencies. To this end, suitable hardware modules are also provided.

Reusable transmitter and receiver units are extracted for these modules, in particular for the communication between an XPP processor field and an XPP configuration controller. These modules and their code are also described below. It should be pointed out that these modules may in part replace and/or supplement XPP-FILMO modules such as those which have been used previously.

The architecture using the RDY/ACK protocol is shown in FIG. 9.

The transmitter must wait for pending ACKs before a RDY signal is assigned. This means that the longest path which determines the frequency of such a system is the path from the receiver to the transmitter, specifically via the logic of the transmitter and back to the receiver and its register enable logic.

An inserted register at the input of the transmitter, as shown in FIG. 10, shortens the longest path, but the logic must wait one cycle longer for pending ACKs. The data transmission rate is reduced to every second clock cycle. This is also true when the pipeline register is not provided at the ACK input, but instead at the RDY and data output.

A second problem occurs when the protocol is used on the PINS or the I/O interface of an XPU. The XPU may be correctly configured and may send a data packet outward. This means that it sends a RDY. Under the assumption that the connected circuit is not in a position to receive data because it is not connected or is not completely programmed, the RDY will be lost and the XPU will be stopped. Later, when the connected circuit outside of the XPU is in a position to receive data, it will not respond because it will not send an ACK without having received a RDY.

8. First Approach Using the Credit FIFO Principle

The Credit FIFO idea, according to example embodiments of the present invention, solves the problem of the reduced throughput with a FIFO in the receiver input. The transmitter is always allowed to send another packet if at least one ACK is pending.

This means that when the transmission begins the first time, two packets are sent without knowing whether or not they will be confirmed (acknowledged). Thus, the second problem mentioned in the preceding section may still exist.

FIG. 12 shows one alternative according to an example embodiment of the present invention. The protocol between the transmitter and receiver is the same but all modules have registers in the outputs as a design variant. This is useful for synthesis estimates and time response estimates. The latter architecture does not require more hardware than the former because a data register must also be present in the former variant.

According to an alternative example embodiment of the present invention, the semantics of the ACK signal is changed to the meaning of “would issue an ACK,” i.e., it shows the ability to receive data. Therefore, these signals are called “ABLE” signals. FIG. 5 shows the version in which there are registers at all module outputs.

The transmitter may always send data in the direction of the receiver if allowed by the ABLE signal. This protocol may then disable the second register in the receiver part if it is certain that the transmitter is holding the transmitted data in a stable stall situation until the receiver signals “ABLE” again.

9.1 Protocol Evaluation-Credit System Semantics

The credit system has the following semantics:

Transmitter: “I am allowed here to send two data packets and as many additional packets as I receive acknowledgments for. If I am not allowed to send another packet, then the last data value must remain valid on the BUS.”

Receiver: “Each received packet will be acknowledged as soon as I am able to receive others.”
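The transmitter side of these semantics may be sketched as a simple credit counter. The following Python fragment is a minimal illustration only; the class name CreditTransmitter and its methods are assumptions introduced here, and the initial credit of two packets is taken from the transmitter statement above.

```python
class CreditTransmitter:
    """Hypothetical transmitter that tracks its remaining credits."""

    def __init__(self, credits=2):
        # Two packets may be sent before any acknowledgment arrives.
        self.credits = credits

    def can_send(self):
        return self.credits > 0

    def send(self):
        """Consume one credit for a transmitted packet."""
        if not self.can_send():
            # Without credit, the last data value must remain valid on the bus.
            raise RuntimeError("no credit left: hold the last data value")
        self.credits -= 1

    def on_ack(self):
        """Each acknowledgment from the receiver returns one credit."""
        self.credits += 1
```

With two initial credits, the acknowledgment of the first packet may arrive after the second packet has already been transmitted, which is the case that keeps the throughput at one packet per cycle.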

9.2 RDY-ABLE Semantics

The RDY-ABLE protocol has the following semantics:

Transmitter: “If the ABLE signal is ‘high,’ I am allowed to send a data packet which is also valid, with a ready signal being on the connection bus during the entire next cycle. If the ABLE signal is ‘low,’ then I must ensure that the instantaneous data will remain on the bus for another cycle.”

Receiver: “ABLE will always be assigned to the connecting bus for the entire next cycle if I am certain that no incoming data packet is lost.”
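These semantics may be illustrated with a cycle-by-cycle simulation. The following Python sketch makes simplifying assumptions that are not part of the specification: the function name simulate_rdy_able is hypothetical, the ABLE value of each cycle is supplied as an input list, and a packet is considered taken by the receiver in any cycle in which RDY and ABLE are both high.

```python
def simulate_rdy_able(packets, able_per_cycle):
    """Return the packets taken by the receiver, in order.

    packets: data packets the transmitter wants to send.
    able_per_cycle: ABLE value seen on the bus in each simulated cycle.
    """
    pending = list(packets)
    bus_data, bus_rdy = None, False
    received = []
    for able in able_per_cycle:
        # Receiver side: a valid packet is taken when ABLE is high.
        if bus_rdy and able:
            received.append(bus_data)
        # Transmitter side: decide the bus contents for the next cycle.
        if able:
            if pending:
                bus_data, bus_rdy = pending.pop(0), True
            else:
                bus_data, bus_rdy = None, False
        # ABLE low: the current bus contents are held for another cycle.
    return received
```

In this model, a cycle with ABLE low merely stalls the stream: the packet on the bus is held and is taken in the next cycle in which ABLE is high, so no packet is lost or duplicated.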

There may be a number of variants for implementing the RDY-ABLE protocol, e.g., pulsed RDY-ABLE or RDY-ABLE having pulsed data. The meanings of high and low may be the opposite of those described above. For pulse-like protocols, each data packet must be valid for only one cycle. This variant needs one more input register in the receiver and may be useful if the bus between the transmitter and receiver is used by more than one connection or possibly is used bidirectionally. Certain IO additions to XPU architectures may be some examples of this.

Comparison

In situations where the number of credits is not known to the transmitter, the credit system is more stable, whereas RDY-ABLE has the advantage that data is not sent until the receiver is in a position to receive data. RDY has an ACK-time curve with a credit system. FIG. 6 shows the bus signal between a transmitter and a receiver in a credit system having RDY/ACK protocol. Five cases are outlined below:

1. transmission of a single packet;
2. streaming;
3. receiver is not immediately ready to receive;
4. receiver is able to receive only at the beginning; and
5. receiver is not ready to receive additional data, e.g., because it has not been reconfigured or it is unable to supply any additional data to the next receiver.

2.5 FIG. 15 shows the bus signal between a transmitter and a receiver using the RDY protocol.

Four cases are outlined:

1. transmission of a single packet “I am allowed while ABLE is active”;
2. streaming is consistently high during ABLE;
3. the transmitter transmits regardless of the ability of the receiver to receive data; and
4. the transmitter stops the flow for one cycle.

To make the communication bus free for other users more frequently, the pulsed RDY-ABLE protocol may be used. However, it is not the standard when simpler hardware is desired because it increases the hardware complexity by the addition of one register. Reference may be made to FIG. 16 for the comparison.

The hardware for RDY (FIG. 17) includes a general module which has a receiver part and a transmitter part for data using the RDY protocol. A specific module may insert its required data processing hardware between the transmitter and the receiver unit. If the central part of FIG. 7 is omitted, then the local RDY, ABLE and data signals fit directly on top of one another on the transmitter and receiver units. The resulting module, consisting of just one transmitter and one receiver unit, is useful in a pipeline stage where many of these modules may be used between a real data producing module and a data-using module. This is useful when a transmitter and a receiver are to be connected over a great distance without having to reduce the frequency or throughput.

A module need not contain exactly one receiver and one transmitter; in many cases multiple receivers and one or more transmitters will be provided in one module, e.g., an arithmetic logic unit or a dual-ported RAM. This is advisable when data is generated in different ways or when data is received via another protocol. Examples may include configurable counters (without receivers) or displays (no forwarding).

Insertion of Simple Registers:

According to an example embodiment of the present invention, if the bus must have simple register stages between the transmitter and receiver, then the receiver must be increased by two registers per inserted stage. An example of this need is the provision of register stages at chip boundaries, e.g., connection pieces provided with registers.

Addendum

Receiver and transmitter for AMBA interfaces:

FIG. 18 illustrates one possible interface arrangement of AMBA for the CM interface of a unit having an XPP core.

For external units with the CM interface of an XPP core, the use of two modules is recommended.

FIG. 19 shows the internal structure of the receiver part which is required in the external interface for the 16-bit output port of the configuration manager, according to an example embodiment of the present invention.

Data is received as follows: when the receiver module displays a 1 (HIGH) on recv_valid, then data has been received and it is instantaneously available at the recv_data output. If the surrounding module is able to receive this data, it assigns a 1 (HIGH) to recv_able. The data is then available only until the end of the same cycle. The data received next is then presented, if available.

For some circuits it may be beneficial to use the recv_rdy signal, which shows that data is currently being taken from the receiver. It is the logical AND of recv_valid and recv_able.
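The interplay of recv_valid, recv_able, recv_data, and recv_rdy may be sketched in software as follows. This Python model is illustrative only: the class ReceiverPort, its internal queue, and the cycle method are assumptions introduced here, whereas the signal names are those used above.

```python
from collections import deque


class ReceiverPort:
    """Hypothetical software model of the receiver module's output side."""

    def __init__(self, words=()):
        self._queue = deque(words)  # words already received from the bus

    @property
    def recv_valid(self):
        # HIGH while a received word is available.
        return len(self._queue) > 0

    @property
    def recv_data(self):
        # The word currently presented, or None if nothing is available.
        return self._queue[0] if self._queue else None

    def cycle(self, recv_able):
        """Simulate one cycle and return (recv_rdy, word taken or None)."""
        recv_rdy = self.recv_valid and recv_able  # logical AND of both signals
        taken = self._queue.popleft() if recv_rdy else None
        return recv_rdy, taken
```

A word remains presented at recv_data for as long as recv_able is LOW; once recv_able is HIGH in a cycle with recv_valid HIGH, the word is taken and the next word, if any, is presented.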

Transmitters in External Units

FIG. 20 shows the internal structure of the transmitter part which is to be part of the external module that establishes an interface connection with the 16-bit input port of the configuration manager, according to an example embodiment of the present invention. A conventional 43-bit code word input of a CM (configuration manager) may also expect this input externally. Both versions may be available in a simulation environment.

If this module and the XPP are directly connected, the signals send_req and n_back may both be set at 0 (LOW). The n_back and n_oe are not used. Data is transmitted as follows: when the transmitter module shows a 1 (HIGH) at send_able, the send_rdy signal may be set at 1 (HIGH), namely with valid data at the send_data input. All this takes place in the same cycle. If new data is available in the next cycle, send_rdy may be set again at 1 (HIGH). Otherwise, it is to be reset to 0 (LOW). Send_data need not be valid in any cycle in which send_rdy is 0 (LOW).

1-8. (canceled)
9. A data transmission controlling method comprising: transmitting, by a first hardware element and to a second hardware element, a data packet at least one of conditional upon and responsive to the second hardware element assigning a signal to a connecting bus via which the data packet is transmitted, the signal indicating that no incoming data packet can be lost.
10. A data transmission controlling method, comprising: transmitting, by a first hardware element and to a second hardware element, a first data packet and subsequently a second data packet; and receiving, by the first hardware element and from the second hardware element, an acknowledgement of the first data packet subsequent to the transmittal of the second data packet.